Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction

نویسندگان

  • Akihiro Abe
  • Kazumasa Yamamoto
  • Seiichi Nakagawa
چکیده

Recently, acoustic models based on deep neural notworks (DNNs) have been introduced and showed dramatic improvements over acoustic models based on GMM in a variety of tasks. In this paper, we considered the improvement of noise robustness of DNN. Inspired by Missing Feature Theory and static noise aware training, we proposed an approach that uses a noise-suppressed acoustic feature and estimated noise information as input of DNN. We used simple Spectral Subtraction as noise-suppression. As noise estimation, we used estimation per utterance or frame. In noisy speech recognition experiments, we compared the proposed method with other methods and the proposed method showed the superior performance than the other approaches. For noise estimation per utterance with log Mel Filterbank, we obtained 28.6% word error rate reduction compared with multi condition training, 5.9% reduction compared with noise adaptive training.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spectral subtraction in noisy environments applied to speaker adaptation based on HMM sufficient statistics

Noise and speaker adaptation techniques are essential to realize robust speech recognition in real noisy environments . In this paper, we applied spectral subtraction to an unsupervised speaker adaptation algorithm in noisy environments. The adaptation algorithm consists of the following five steps. (1) Spectral subtraction is carried out for noise added database. (2) Noise matched acoustic mod...

متن کامل

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

متن کامل

Robust Speech Recognition Using Generalized Distillation Framework

In this paper, we propose a noise robust speech recognition system built using generalized distillation framework. It is assumed that during training, in addition to the training data, some kind of ”privileged” information is available and can be used to guide the training process. This allows to obtain a system which at test time outperforms those built on regular training data alone. In the c...

متن کامل

Improved HMM Separation for Distant-Talking Speech Recognition

In distant-talking speech recognition, the recognition accuracy is seriously degraded by reverberation and environmental noise. A robust speech recognition technique in such environments, HMM separation and composition, has been described in [1]. HMM separation estimates the model parameters of the acoustic transfer function using adaptation data uttered from an unknown position in noisy and re...

متن کامل

Robust i-vector based adaptation of DNN acoustic model for speech recognition

In the past, conventional i-vectors based on a Universal Background Model (UBM) have been successfully used as input features to adapt a Deep Neural Network (DNN) Acoustic Model (AM) for Automatic Speech Recognition (ASR). In contrast, this paper introduces Hidden Markov Model (HMM) based ivectors that use HMM state alignment information from an ASR system for estimating i-vectors. Further, we ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015